Shallow morphology based complex predicates extraction in Oriya

Authors

  • R. C. Balabantaray
  • M. K. Jena
  • S. Mohanty
  • Nitin Indurkhya
  • Fred J. Damerau
  • Richard Beckwith
  • Christiane Fellbaum
  • Derek Gross
  • Katherine Miller
  • Binod Bihari
Abstract

This paper presents the extraction of Complex Predicates (CPs) in Oriya based on shallow morphology and available seed lists of verbs. Oriya is generally a free word order language. Free word order languages have relatively unrestricted local word group or phrase structures, which makes the extraction of complex predicates quite challenging. Complex predicates are a special class of multi-word expressions; they are extracted here with special emphasis on compound verbs (Verb + Verb) and conjunct verbs (Noun/Adjective + Verb, or Verb + Noun/Adjective). The lexicalization of compound and conjunct verbs is based on shallow morphological information. Lexical scopes of compound and conjunct verbs in consecutive sequences of Complex Predicates (CPs) have been identified. The aim of the current work is to investigate the possibility of improving the accuracy of complex predicate extraction by making it sensitive to verb subcategorization, and to evaluate recall, precision and F-score in different operational environments.
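The pattern-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' system: it assumes the input is already tokenized and tagged with coarse part-of-speech labels (`NOUN`, `ADJ`, `VERB`), that the seed list is a plain set of light-verb forms, and it covers only the Verb + Verb and Noun/Adjective + Verb patterns. The names `extract_candidates` and `precision_recall_f1`, and the placeholder tokens such as `N1` and `LV1`, are hypothetical and stand in for real Oriya word forms.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]            # (word form, coarse POS tag) -- assumed input format
Candidate = Tuple[str, str, str]   # (kind, first word, second word)


def extract_candidates(tokens: List[Token], light_verbs: Set[str]) -> List[Candidate]:
    """Scan adjacent token pairs and keep those matching a CP pattern.

    A pair is kept only if its second word is a verb found in the seed list.
    """
    candidates: List[Candidate] = []
    for (w1, t1), (w2, t2) in zip(tokens, tokens[1:]):
        if t2 != "VERB" or w2 not in light_verbs:
            continue
        if t1 == "VERB":
            candidates.append(("compound", w1, w2))   # Verb + Verb
        elif t1 in {"NOUN", "ADJ"}:
            candidates.append(("conjunct", w1, w2))   # Noun/Adjective + Verb
    return candidates


def precision_recall_f1(predicted: Set[Candidate], gold: Set[Candidate]) -> Tuple[float, float, float]:
    """Standard precision, recall and balanced F-score over sets of candidates."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Placeholder tokens only; real input would come from a morphological analyser.
sentence = [("N1", "NOUN"), ("LV1", "VERB"), ("V1", "VERB"),
            ("LV2", "VERB"), ("A1", "ADJ"), ("V2", "VERB")]
seed_light_verbs = {"LV1", "LV2"}
found = extract_candidates(sentence, seed_light_verbs)
```

Matching only adjacent pairs keeps the sketch short; handling intervening particles or the lexical scope of consecutive CPs, as the abstract mentions, would need a wider window.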

Similar resources

Automatic Extraction of Complex Predicates in Bengali

This paper presents the automatic extraction of Complex Predicates (CPs) in Bengali with a special focus on compound verbs (Verb + Verb) and conjunct verbs (Noun/Adjective + Verb). The lexical patterns of compound and conjunct verbs are extracted based on the information of shallow morphology and available seed lists of verbs. Lexical scopes of compound and conjunct verbs in consecutive sequen...

The Interlanguage of Persian Learners of Italian: a Focus on Complex Predicates

This paper aims at investigating the acquisition of Italian complex predicates by native speakers of Persian. Complex predication is not as pervasive a phenomenon in Italian as it is in Persian. Yet Italian native speakers use complex predicates productively; spontaneous data show that Persian learners of Italian seem to be perfectly aware of Italian complex predicates and use this familiar fea...

Zone Based Relative Density Feature Extraction Algorithm for Unconstrained Handwritten Numeral Recognition

Handwritten digit recognition has been a challenging problem for researchers for a few decades. This paper proposes a relative density feature extraction algorithm for recognizing unconstrained, single connected handwritten numerals independently of language. The proposed method consists of four phases, namely image enhancement (dilation), representation (zone based), ...

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP), intensified by the growing volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...

Evaluation of the NLP Components of an Information Extraction System for German

This paper describes ongoing work on the evaluation of the NLP components of the core engine of smes (Saarbrücker Message Extraction System), which consists of a tokenizer, an efficient and robust German morphology, a part-of-speech (POS) tagger, a shallow parsing module, a linguistic knowledge base and an output construction component. Currently the morphology, the tagger and a parsing module ...

Publication date: 2011